On the Communication Complexity of Approximate Nash Equilibria*



Paul W. Goldberg^, Arnoud Pastink^ 

^ University of Liverpool 
Dept. of Computer Science 
Ashton Street, Liverpool L69 3BX, U. K. 
P.W.Goldberg@liverpool.ac.uk

^ Utrecht University 
Department of Information and Computing Science 
P.O. Box 80089, 3508TB Utrecht, The Netherlands 

A.J.Pastink@uu.nl

February 18, 2013 



Abstract 

We study the problem of computing approximate Nash equilibria of bimatrix games, in a
setting where players initially know their own payoffs but not the payoffs of the other player.
In order for a solution of reasonable quality to be found, some amount of communication needs
to take place between the players. We are interested in algorithms where the communication
is substantially less than the contents of a payoff matrix, for example logarithmic in the size of
the matrix. When the communication is polylogarithmic in the number of strategies $n$, we show
how to obtain an $\epsilon$-approximate Nash equilibrium for $\epsilon$ approximately 0.438, and for
well-supported approximate equilibria we obtain $\epsilon$ approximately 0.732. For one-way communication
we show that $\epsilon = \frac{1}{2}$ is achievable, but no constant improvement over $\frac{1}{2}$ is possible, even
with unlimited one-way communication. For well-supported equilibria, no value of $\epsilon < 1$ is achievable
with one-way communication. When the players do not communicate at all, $\epsilon$-Nash equilibria can
be obtained for $\epsilon = \frac{3}{4}$, and we also give a lower bound of slightly more than $\frac{1}{2}$ on the lowest
constant $\epsilon$ achievable.



1 Introduction 

Algorithmic game theory is concerned not just with the properties of a solution concept, but also with
how that solution can be obtained. It is considered desirable that the outcome of a game should be
"easy to compute", which in the algorithms community is typically formalised as polynomial-time
computability. In that respect the PPAD-completeness results of [H [2] are interpreted as a
"complexity-theoretic critique" of Nash equilibrium. Following those results, a line of work
addressed the problem of computing $\epsilon$-Nash equilibrium, where $\epsilon > 0$ is a parameter that bounds a
player's incentive to deviate in a solution. Thus, $\epsilon$-Nash equilibrium imposes a weaker constraint
on how players are assumed to behave, and an exact Nash equilibrium is obtained for $\epsilon = 0$. The
main open problem is to find out what values of $\epsilon$ admit a polynomial-time algorithm. Below we
summarise some of the progress in this direction.

*A preliminary version of this paper appeared in the Proceedings of the 5th SAGT. The first author was supported
by EPSRC Grant EP/G069239/1 "Efficient Decentralised Approaches in Algorithmic Game Theory"

Beyond the existence of a fast algorithm, it is also desirable that a solution should be obtained 
by a process that is simple and decentralised, since that is likely to be a better model for how players 
in a game may eventually reach a solution. In that respect, most of the known efficient algorithms 
for computing $\epsilon$-Nash equilibria are not entirely satisfying. They take as input the payoff matrices
and output the approximate Nash equilibrium. If we try to translate such an algorithm into real 
life, it would correspond to a process where the players pass their payoffs to a central authority, 
which returns to them some mixed strategies that have the "low incentive to deviate" guarantee. In 
this paper we aim to model a setting where players perform individual computations and exchange 
some limited information. We revisit the question of what values of $\epsilon$ are achievable, subject to
this restriction to more "realistic" algorithms. 

There are various ways in which one can try to model the notion of a decentralised algorithm;
here we consider a general approach that has previously been studied in [H [18] in the context
of computing exact Nash equilibria. The players begin with knowledge of their own payoffs but
not the payoffs of the other players; this is often called an uncoupled setting (see Section 1.2.4 for
an overview). An algorithm involves communication in addition to computation; to find a
game-theoretic solution, a player usually has to know something about the other players' matrices, but
hopefully not all of that information. We study the computation of $\epsilon$-Nash equilibria in this setting,
and the general topic is the trade-off between the amount of communication that takes place and
the value of $\epsilon$ that can be obtained. In uncoupled settings, there are natural dynamic processes
that converge to correlated equilibria, but the results are less positive for exact Nash equilibria, so
this paper can be seen as an investigation into approximate Nash equilibrium as an alternative
solution concept to correlated equilibrium.

1.1 Definitions 

We consider 2-player games, with a row player and a column player, who both have $n$ pure strategies.
The game $(R, C)$ is defined by two $n \times n$ payoff matrices, $R$ for the row player and $C$ for the column
player. The pure strategies for the row player are his rows and the pure strategies of the column
player are her columns. If the row player plays row $i$ and the column player plays column $j$,
the payoff for the row player is $R_{ij}$, and $C_{ij}$ for the column player. For the row player a mixed
strategy is a probability distribution $x$ over the rows, and a mixed strategy for the column player
is a probability distribution $y$ over the columns, where $x$ and $y$ are column vectors and $(x, y)$ is a
mixed strategy profile. The payoffs resulting from these mixed strategies $x$ and $y$ are $x^T R y$ for the
row player and $x^T C y$ for the column player.

A Nash equilibrium is a pair of mixed strategies $(x^*, y^*)$ where neither player can get a higher
payoff by playing another strategy, assuming the other player does not change his strategy. Because
a player's expected payoff is linear in his own mixed strategy, the largest gain can be achieved by
deviating to a pure strategy. Let $e_i$ be the vector with a 1 at the $i$th position and a 0 at every
other position. Thus a Nash equilibrium $(x^*, y^*)$ satisfies

$$\forall i = 1 \ldots n: \quad e_i^T R y^* \le (x^*)^T R y^* \quad \text{and} \quad (x^*)^T C e_i \le (x^*)^T C y^*.$$

We assume that the payoffs of $R$ and $C$ are between 0 and 1, which can be achieved by rescaling.
An $\epsilon$-approximate Nash equilibrium (or, $\epsilon$-Nash equilibrium) is a strategy pair $(x^*, y^*)$ such that
each player can gain at most $\epsilon$ by unilaterally deviating to a different strategy. Thus, it is a pair
$(x^*, y^*)$ satisfying

$$\forall i = 1 \ldots n: \quad e_i^T R y^* \le (x^*)^T R y^* + \epsilon \quad \text{and} \quad (x^*)^T C e_i \le (x^*)^T C y^* + \epsilon.$$

We say that the regret of a player is the difference between his payoff and the payoff of his best 
response. 

The support of a mixed strategy $x$, denoted $\mathrm{Supp}(x)$, is the set of pure strategies that are played
with non-zero probability by $x$. An approximate well-supported Nash equilibrium strengthens the
requirements of an approximate Nash equilibrium. For a mixed strategy $y$ of the column player, a pure
strategy $i \in [n]$ is an $\epsilon$-best response for the row player if, for all pure strategies $i' \in [n]$, we have
$e_i^T R y \ge e_{i'}^T R y - \epsilon$. We define $\epsilon$-best responses for the column player analogously. A mixed strategy
profile $(x, y)$ is an $\epsilon$-well-supported Nash equilibrium ($\epsilon$-WSNE) if every pure strategy in $\mathrm{Supp}(x)$
is an $\epsilon$-best response against $y$, and every pure strategy in $\mathrm{Supp}(y)$ is an $\epsilon$-best response against $x$.
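The definitions of regret, $\epsilon$-Nash equilibrium and $\epsilon$-WSNE above can be checked mechanically. The following is a minimal sketch in plain Python; the function names and the numerical tolerance `tol` are illustrative assumptions, not part of the paper.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def pure_payoffs_row(R: Matrix, y: Vector) -> Vector:
    """Payoff e_i^T R y of every pure row i against the column mixture y."""
    return [sum(r_ij * y_j for r_ij, y_j in zip(row, y)) for row in R]


def pure_payoffs_col(C: Matrix, x: Vector) -> Vector:
    """Payoff x^T C e_j of every pure column j against the row mixture x."""
    return [sum(x[i] * C[i][j] for i in range(len(x))) for j in range(len(C[0]))]


def regrets(R: Matrix, C: Matrix, x: Vector, y: Vector) -> Tuple[float, float]:
    """(row regret, column regret): best-response payoff minus current payoff."""
    rp = pure_payoffs_row(R, y)
    cp = pure_payoffs_col(C, x)
    row_value = sum(xi * p for xi, p in zip(x, rp))
    col_value = sum(yj * p for yj, p in zip(y, cp))
    return max(rp) - row_value, max(cp) - col_value


def is_eps_nash(R, C, x, y, eps: float, tol: float = 1e-12) -> bool:
    """True if neither player can gain more than eps by deviating."""
    row_regret, col_regret = regrets(R, C, x, y)
    return row_regret <= eps + tol and col_regret <= eps + tol


def is_eps_wsne(R, C, x, y, eps: float, tol: float = 1e-12) -> bool:
    """True if every pure strategy in each support is an eps-best response."""
    rp = pure_payoffs_row(R, y)
    cp = pure_payoffs_col(C, x)
    row_ok = all(rp[i] >= max(rp) - eps - tol for i in range(len(x)) if x[i] > 0)
    col_ok = all(cp[j] >= max(cp) - eps - tol for j in range(len(y)) if y[j] > 0)
    return row_ok and col_ok


# A 2 x 2 coordination game: the row player puts a little weight on a bad row.
R = [[1.0, 0.0], [0.0, 1.0]]
C = [[1.0, 0.0], [0.0, 1.0]]
x = [0.9, 0.1]
y = [1.0, 0.0]
print(is_eps_nash(R, C, x, y, 0.1))  # True: the row player's regret is about 0.1
print(is_eps_wsne(R, C, x, y, 0.1))  # False: row 2 is in the support but earns 0
```

The example shows why the well-supported notion is more demanding: a small probability on a poor pure strategy costs little in expectation, yet that strategy is far from being a best response.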

The communication model: Each player $p \in \{r, c\}$ has an algorithm $A_p$ whose initial input
data is $p$'s $n \times n$ payoff matrix. Communication proceeds in a number of rounds, where in each
round, each player may send a single bit of information to the other player. During each round,
each player may also carry out a polynomial (in $n$) amount of computation. (A natural variant
of the model would omit the restriction to polynomial computation. Indeed, our lower bounds on
communication requirement do not depend on computational limits.) At the end, each player $p$
outputs a mixed strategy $x_p$. We aim to design (pairs of) algorithms $(A_r, A_c)$ that output $\epsilon$-Nash
strategy profiles $(x_r, x_c)$, and are economical with the number of rounds of communication. This
is similar to the mixed Nash equilibrium procedure of [18], here applied to approximate rather than
exact equilibria.

Notice that given $\Theta(n^2)$ rounds of communication, we can apply any centralised algorithm $A$
by getting (say) the row player to pass additive approximations of all his payoffs to the column
player, who applies $A$ and passes to the row player the mixed strategy obtained by $A$ for the row
player. (The quality of the $\epsilon$-Nash equilibrium is proportional to the quality of the additive
approximations used.) For this reason we focus on algorithms with many fewer rounds, and we
obtain results for logarithmic or polylogarithmic (in $n$) rounds.

We also consider a restriction to one-way communication, where one player may send but not 
receive information. 

1.2 Related Work 

We start by reviewing some algorithms that we adapt to the communication-bounded setting. Then 
we review the background work on communication complexity, and related work in computing Nash 
equilibria, including learning of equilibria in uncoupled settings. 

1.2.1 Algorithms for approximate equilibria 

In recent years a number of algorithms [24], [9], [10], [H [31] have been developed that compute (in
polynomial time) $\epsilon$-Nash equilibria for various values of $\epsilon$. Of these, Tsaknakis and Spirakis [31]
obtain the best (smallest) value of $\epsilon$, of approximately 0.3393. The more demanding criterion of
well-supported $\epsilon$-Nash equilibrium disallows a player from allocating positive probability to any
pure strategy whose payoff is more than $\epsilon$ worse than the best response. Progress on polynomial-time
algorithms for this solution concept has been more limited; at this time the lowest $\epsilon$ that can
be guaranteed by a polynomial-time algorithm is only slightly less than $\frac{2}{3}$ [12], obtained via a
modification of a $\frac{2}{3}$-approximation algorithm of Kontogiannis and Spirakis [25]. Prior to that, [9] gave
a $\frac{5}{6}$-approximation algorithm that is contingent on a graph-theoretic conjecture. In this context,
our 0.732-approximation algorithm substantially improves on the result of [S], both in terms of
approximation quality and a more demanding model (communication-bounded algorithms). However,
we do not know how to obtain the better approximation quality of [25], [12] in the
communication-bounded setting. Next we discuss two of the earlier algorithms in the literature whose ideas we use
here.

DMP-algorithm: The DMP-algorithm [9] works as follows to achieve a $\frac{1}{2}$-approximate Nash
equilibrium. The algorithm picks an arbitrary row for the row player, say row $i$. Let $j \in \arg\max_{j'} C_{ij'}$.
Let $k \in \arg\max_{k'} R_{k'j}$. So $j$ is a pure-strategy best response for the column player to row $i$ and
$k$ is a best response strategy for the row player to column $j$. The strategy pair $(x^*, y^*)$ will now
be $x^* = \frac{1}{2} e_i + \frac{1}{2} e_k$ and $y^* = e_j$. With this strategy pair the row player plays, with
probability $\frac{1}{2}$, a best response to the pure strategy of the column player, and the column player has a
pure strategy that is with probability $\frac{1}{2}$ a best response.

The DMP-algorithm is well-adapted to the limited-communication setting. Suppose the row 
player uses $i = 1$ as his initial choice of row. The column player needs to tell the row player
her value of $j$, a communication of $O(\log n)$ bits. No further communication is needed. Notice
moreover that the communication is all one-way; the row player does not need to tell the column 
player anything. 
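The one-way version of the DMP-algorithm described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the names `dmp_one_way` and `worst_regret` are hypothetical, the fixed initial row is taken to be index 0 (row 1 in the text), and the message length is counted as $\lceil \log_2 n \rceil$ bits for the index $j$. Each player only reads his or her own matrix.

```python
import math
import random


def regrets(R, C, x, y):
    """(row regret, column regret) of the profile (x, y) in the game (R, C)."""
    n = len(R)
    rp = [sum(R[i][j] * y[j] for j in range(n)) for i in range(n)]
    cp = [sum(x[i] * C[i][j] for i in range(n)) for j in range(n)]
    row_value = sum(x[i] * rp[i] for i in range(n))
    col_value = sum(y[j] * cp[j] for j in range(n))
    return max(rp) - row_value, max(cp) - col_value


def dmp_one_way(R, C):
    """Return (x, y, bits_sent) for the one-way DMP protocol."""
    n = len(R)
    i = 0  # row fixed in advance by convention, so it need not be communicated
    # Column player: best response to row i, computed from C alone.
    j = max(range(n), key=lambda jp: C[i][jp])
    bits_sent = max(1, math.ceil(math.log2(n)))  # cost of sending the index j
    # Row player: best response to column j, computed from R alone.
    k = max(range(n), key=lambda kp: R[kp][j])
    x = [0.0] * n
    x[i] += 0.5
    x[k] += 0.5
    y = [0.0] * n
    y[j] = 1.0
    return x, y, bits_sent


def worst_regret(trials=200, n=8, seed=0):
    """Largest regret of either player over random games with payoffs in [0, 1]."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        R = [[rng.random() for _ in range(n)] for _ in range(n)]
        C = [[rng.random() for _ in range(n)] for _ in range(n)]
        x, y, _ = dmp_one_way(R, C)
        worst = max(worst, *regrets(R, C, x, y))
    return worst
```

On any game with payoffs in $[0, 1]$ the regret of either player under this profile is at most $\frac{1}{2}$, so `worst_regret()` should never exceed 0.5.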

Subsequent algorithms for computing $\epsilon$-Nash equilibria cannot so easily be adapted to a
limited-communication setting, but we can use some of the ideas they develop to obtain values of $\epsilon$ below
$\frac{1}{2}$ in this setting.

An algorithm of Bosse et al. [1]: The algorithm presented in [1] can be seen as a modification
of the DMP-algorithm and achieves a 0.38197-approximate Nash equilibrium. Instead of a player
allocating some probability to some arbitrary pure strategy, the algorithm starts with the row player
allocating some probability to the row-player strategy $x$ belonging to the Nash equilibrium of the
zero-sum game $(R - C, C - R)$. In solving the zero-sum game efficiently we apply the connection
of zero-sum games with linear programming [29], [6], [23]. If the (mixed) strategy profile $(x, y)$ that
constitutes a Nash equilibrium of $(R - C, C - R)$ gives a 0.38197-approximate Nash equilibrium
for $(R, C)$, this solution is used. Otherwise, the column player plays a best response $e_j$ to $x$ and
the row player plays a mixture of $x$ and $e_i$, where $i$ is a best response to the strategy $e_j$ of the
column player. ([1] goes on to improve the worst-case performance to a 0.36395-approximate Nash
equilibrium.)

Notice that this algorithm cannot be adapted in a straightforward way to our communication-bounded
setup, since it requires a computation using knowledge of both matrices. The starting-point
of our algorithms of Section [J] is the players separately solving $(R, -R)$ and $(-C, C)$.

1.2.2 Communication Complexity 

The "classical" setting of communication complexity is based on the model introduced by Yao
in [32]. We will follow the representation in [26]. We have two agents^, one holding an input
$x \in \{0, 1\}^n$ and the other holding an input $y \in \{0, 1\}^n$. The objective is to compute $f(x, y) \in \{0, 1\}$,
a joint function of their inputs. The computation of $f(x, y)$ is done via a communication protocol
$V$. During the execution of the protocol, the agents send messages to each other. While the
protocol has not terminated, the protocol specifies what message the sender should send next,
based on the input of the protocol and the communication so far. If the protocol terminates, it
will output the value $f(x, y)$. A communication protocol $V$ computes $f$ if for every input pair
$(x, y) \in \{0, 1\}^n \times \{0, 1\}^n$, it terminates with the value $f(x, y)$ as output.

^We use agents instead of players to avoid confusion; the communication does not have to be between the players
of the game.

The communication complexity of a communication protocol $V$ for computing $f(x, y)$ is the
number of bits sent during the execution of $V$, which we denote by $CC(V, f, x, y)$. The communication
complexity of a protocol $V$ for a function $f$ is defined as the worst case communication
complexity over all possible inputs $(x, y) \in \{0, 1\}^n \times \{0, 1\}^n$, which we denote by $CC(V, f)$:

$$CC(V, f) = \max_{(x, y) \in \{0,1\}^n \times \{0,1\}^n} CC(V, f, x, y)$$

The communication complexity of a function $f$ is the minimum over all possible protocols:

$$CC(f) = \min_{V} CC(V, f)$$

1.2.3 Existing results on communication complexity of Nash equilibria 

There are a few results concerning the communication complexity of Nash equilibria. Conitzer and
Sandholm [3] show a lower bound of $\Omega(n^2)$ on the communication complexity for 2-player games of
finding a pure Nash equilibrium, where $n$ is the number of pure strategies for each player. They
also show a simple algorithm that finds a pure Nash equilibrium (if it exists) in $O(n^2)$. They do
not extend their analysis to mixed Nash equilibria; their focus is on searching for a pure Nash
equilibrium (if one exists), in contrast with the existence of a mixed Nash equilibrium, which is
guaranteed [28]. For unrestricted bimatrix games, it can be seen that the communication complexity
of finding an exact equilibrium is $\Omega(n^2)$^. That observation leads to the question addressed here,
of whether approximate equilibria have lower communication complexity.

Also related to this paper, Hart and Mansour [18] study the communication complexity of
uncoupled equilibrium procedures (discussed in more detail below in Section 1.2.4) in the context
of multiplayer, binary action games. The emphasis is on lower bounds on the communication
requirement. Analogously to the $\Omega(n^2)$ communication needed for pure or mixed Nash equilibrium
that we noted above, they obtain a lower bound of $\Omega(2^s)$ (where $s$ is the number of players) on
the communication needed to find an exact mixed equilibrium, or determine the existence of a
pure one. (Note that in their setting, each player has a payoff matrix of size $2^s$, so that essentially
all the payoffs may need to be communicated.) On the other hand, they obtain a polynomial 
upper bound on the communication required to find a correlated equilibrium, discussed further 
below. Their methods do not seem to be applicable in an obvious way to approximate equilibria. 
For example, the lower bound for computing a mixed equilibrium involves a game whose solution 
requires probabilities having exponentially large descriptions, which would not be needed in the 
context of approximate equilibria. 

1.2.4 Uncoupled Learning of Equilibria 

An extensive literature studies uncoupled procedures for finding game-theoretic solutions. The
terminology "uncoupled" is introduced in [20]; it refers to settings where each player knows his
own (but not the others') utility function. Then, there is a sequence of rounds (a.k.a. time steps,
or periods), in which each player plays a strategy, and receives the payoff resulting from the entire
strategy profile. Our setting of communication complexity is related to this, in that each player
can use his choice of action (in a round) to transmit information. The main difference is that here,
we do not assume a "rational" choice of action where a player tries to maintain his payoff over time
by predicting the choices of his opponents. In our set-up, players communicate some information
over a (hopefully short) sequence of rounds, and afterwards promise to use certain mixed strategies.
Our interest is in both upper and lower bounds on the required length of the sequence. As noted
in Conitzer and Sandholm [5], lower-bound type results generally ignore strategic considerations,
which perhaps helps to justify our own inattention to rationality in this paper.

^Consider a game where there is a unique, fully-mixed Nash equilibrium. If the payoffs are perturbed slightly, the
resulting equilibrium, for (say) the row player, will be affected in a non-trivial way by all the perturbations of the
column player's payoffs. This immediately results in the requirement of $\Omega(n^2)$ communication.

In the context of uncoupled search for Nash equilibrium, Hart and Mas-Colell [20] show that 
when players do not remember the history of play, it may be impossible to reach Nash equilibrium. 
Note that the obstacle is informational rather than due to rationality of the players. A subsequent 
paper [21] analyses how much of the history of play needs to be recalled by the players. In the case 
of mixed (approximate) Nash equilibria, the approach is to test many probability distributions in a
search for one that constitutes an approximate equilibrium; a large number of rounds is required to
achieve this. Foster and Young [15] show how this can be achieved in a "radically uncoupled" setup,
where a player does not directly observe the opponents' behaviour, but observes it indirectly via
the payoffs he obtains. Again, a very large number of rounds are required to find an approximate
equilibrium. Daskalakis et al. [7] study negative results, namely failure to converge to Nash
equilibrium, for standard multiplicative weights update algorithms, in the context of bimatrix games.
Their results consider three variants of uncoupled dynamics. 

There are more natural learning algorithms that converge (in various senses) to the weaker solution
concept of correlated equilibrium (e.g. Foster and Vohra [14], Hart and Mas-Colell [H]). When
we relax our objective from approximate Nash equilibrium to approximate correlated equilibrium,
then learning can take place with a sublinear number of rounds, from a straightforward application
of no-regret learning algorithms. The idea is applied in Theorem 30 of [18]. In particular, we
equip each player^ with a no-regret algorithm, and suppose that at each round it duly selects (and
outputs) a pure strategy, which requires $\log(n)$ bits to output. Indeed, Theorem 17 of [18] shows
how exact correlated equilibrium may be found in a polynomial number of rounds.

Foster and Young [15] point out as motivation for uncoupled learning rules, that uncoupledness 
prevents a learning rule from behaving like a centralised algorithm and just constituting a theory of 
equilibrium selection. In this paper we similarly avoid the possibility of implementing a centralised
algorithm, through restricting to a sublinear number of rounds of communication, so that it is
impossible for one player to reveal all (or even a large fraction) of his payoffs to the other player. 

1.3 Overview of our results 

For general $n \times n$ games we show the following bounds on the obtainable quality of an approximate
Nash equilibrium if we fix the amount of communication allowed. We start by considering a version
where no communication is allowed. Theorem 1 gives a simple way to find a $\frac{3}{4}$-Nash equilibrium
in this setting. Theorem 2 identifies a corresponding lower bound of slightly more than $\frac{1}{2}$. For
one-way communication we exhibit (Theorem 3) a lower bound of $\frac{1}{2}$ - o(-^). The DMP-algorithm
can be implemented as an algorithm with one-way communication and gives a $\frac{1}{2}$-approximate
Nash equilibrium. Therefore the constant $\frac{1}{2}$ in the lower bound of Theorem 3 is tight, in this
context. In Section 4.1 we show how to compute a 0.438-Nash equilibrium using polylogarithmic
communication. In Section 5 we discuss the significance of the results, along with open problems.

^Indeed, there may be any number of players, not just 2. 
2 Approximate Nash Equilibria with no Communication



The simplest way to restrict communication is to disallow it entirely^. That means that for each
player $p \in \{r, c\}$, we must find a function $f_p$ from $p$'s payoff matrix to a mixed strategy, such that
for all pairs of matrices $(R, C)$, we have that $(f_r(R), f_c(C))$ is an $\epsilon$-Nash equilibrium. In this section
we show that the achievable value of $\epsilon$ lies somewhere between 0.501 and $\frac{3}{4}$. The $\frac{3}{4}$ upper bound
is achieved via a simple algorithm (differing from the $\frac{3}{4}$-approximation algorithm of [24] in terms
of the solution it finds). Theorem 2 presents the lower bound of 0.501.

Theorem 3 in Section 3 furnishes a lower bound of $\frac{1}{2}$, even when one-way communication is
permitted, and has a simpler proof (the proof is similar to Case 2 in the proof of Theorem 2). This
raises the question: why bother to include a complicated proof (specific to the communication-free
setting) whose result is only a small improvement (over the one-way communication setting)?
The reason is that we rule out the possibility that $\frac{1}{2}$ is in fact the answer, and as we discuss in
the conclusions (Section 5), $\frac{1}{2}$ seems to arise frequently as a barrier to progress in the study of
algorithms for approximate Nash equilibria, so it is informative to rule out that possibility. Our
lower bound of 0.501 could be increased slightly by tweaking the parameters of the proof, but we
believe that the resulting progress would be incremental.

Theorem 1 It is possible to guarantee a $\frac{3}{4}$-approximate Nash equilibrium, even if there is no
communication between the players.

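Proof. Each player allocates probability $\frac{1}{2}$ to his first pure strategy, and $\frac{1}{2}$ to his best response
to the other player's first pure strategy. In detail, let $i \in \arg\max_{i'} R_{i'1}$ and let $j \in \arg\max_{j'} C_{1j'}$.
The approximate Nash equilibrium will be $f_r(R) = \frac{1}{2} e_1 + \frac{1}{2} e_i$ and $f_c(C) = \frac{1}{2} e_1 + \frac{1}{2} e_j$.

Let $i'$ be a best pure strategy response of the row player to $f_c(C)$. Then his incentive to deviate
is

$$\left(\tfrac{1}{2} R_{i'1} + \tfrac{1}{2} R_{i'j}\right) - \left(\tfrac{1}{4} R_{11} + \tfrac{1}{4} R_{1j} + \tfrac{1}{4} R_{i1} + \tfrac{1}{4} R_{ij}\right)$$

$$\le \left(\tfrac{1}{4} R_{i'1} + \tfrac{1}{2} R_{i'j}\right) - \left(\tfrac{1}{4} R_{11} + \tfrac{1}{4} R_{1j} + \tfrac{1}{4} R_{ij}\right) \le \tfrac{1}{4} R_{i'1} + \tfrac{1}{2} R_{i'j} \le \tfrac{1}{4} + \tfrac{1}{2} = \tfrac{3}{4}$$

where the first inequality holds because $i$ was a best response to column 1 (so $R_{i1} \ge R_{i'1}$) and the
next inequalities hold because payoffs lie in $[0, 1]$. The same kind of argument holds for the column
player. This proves the theorem. □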
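The communication-free procedure from the proof of Theorem 1 is short enough to write out. The sketch below is illustrative only: `row_strategy`, `col_strategy` and `worst_regret_no_comm` are hypothetical names, index 0 stands for the "first pure strategy" of the text, and the random check is a sanity test rather than a proof.

```python
import random


def row_strategy(R):
    """f_r(R): half on row 1, half on a best response to column 1 (uses R only)."""
    n = len(R)
    i = max(range(n), key=lambda ip: R[ip][0])
    x = [0.0] * n
    x[0] += 0.5
    x[i] += 0.5
    return x


def col_strategy(C):
    """f_c(C): half on column 1, half on a best response to row 1 (uses C only)."""
    n = len(C[0])
    j = max(range(n), key=lambda jp: C[0][jp])
    y = [0.0] * n
    y[0] += 0.5
    y[j] += 0.5
    return y


def regrets(R, C, x, y):
    """(row regret, column regret) of the profile (x, y) in the game (R, C)."""
    n = len(R)
    rp = [sum(R[i][j] * y[j] for j in range(n)) for i in range(n)]
    cp = [sum(x[i] * C[i][j] for i in range(n)) for j in range(n)]
    row_value = sum(x[i] * rp[i] for i in range(n))
    col_value = sum(y[j] * cp[j] for j in range(n))
    return max(rp) - row_value, max(cp) - col_value


def worst_regret_no_comm(trials=200, n=6, seed=0):
    """Largest regret seen over random games with payoffs in [0, 1]."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        R = [[rng.random() for _ in range(n)] for _ in range(n)]
        C = [[rng.random() for _ in range(n)] for _ in range(n)]
        worst = max(worst, *regrets(R, C, row_strategy(R), col_strategy(C)))
    return worst
```

Because `row_strategy` never reads `C` and `col_strategy` never reads `R`, the pair is a valid instance of the functions $(f_r, f_c)$ of this section, and by Theorem 1 the value returned by `worst_regret_no_comm()` cannot exceed 0.75.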

The following definition and lemma provide a construction that is used in Theorems 2 and 3.

Definition 1 Let $M_n$ be a matrix with $n$ columns and $\binom{n}{k}$ rows, where $k = \lfloor \sqrt{n} \rfloor$ and each row consists
of $k$ 1's and $(n - k)$ 0's. Every row is distinct, so the $\binom{n}{k}$ rows are all the possible sequences with
$k$ 1's in a row of length $n$.
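For small $n$ the matrix of Definition 1 can be enumerated directly. The following sketch (the name `build_M` is an assumption made for illustration) builds it with the standard library and prints the row and column counts for $n = 9$.

```python
from itertools import combinations
from math import comb, isqrt


def build_M(n):
    """Return (M, k): all 0/1 rows of length n with exactly k = floor(sqrt(n)) ones."""
    k = isqrt(n)
    M = []
    for ones in combinations(range(n), k):
        row = [0] * n
        for j in ones:
            row[j] = 1
        M.append(row)
    return M, k


M, k = build_M(9)
column_ones = [sum(row[j] for row in M) for j in range(9)]
print(k, len(M), comb(9, k))  # 3 84 84
print(column_ones)            # every column holds 28 ones, i.e. (k/n) * C(n, k)
```

The number of rows grows as $\binom{n}{k}$, so this enumeration is only practical for small $n$.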

Lemma 1 Suppose we have a bimatrix game where the row player's payoff matrix is $M_n$ (as in
Definition 1). Let $x$ be a mixed strategy for the row player. Then, there exists a column of $M_n$
such that if the column player uses any mixed strategy $y$ that allocates probability $p$ to that column,
then the row player's regret is at least $p - O(1/\sqrt{n})$.

^This is to some extent inspired by earlier work of the first author [T6] that studied an approach to pattern classi- 
fication in which the set of observations of each class must be processed by an algorithm that proceeds independently 
of the corresponding algorithms that receive members of the other classes. 
Proof. The rows of $M_n$ contain 1's in a fraction $\frac{k}{n}$ of their entries. By symmetry, so do the columns,
thus every column contains $\frac{k}{n} \binom{n}{k}$ 1's and $(1 - \frac{k}{n}) \binom{n}{k}$ 0's (recall $k = \lfloor \sqrt{n} \rfloor$).

The strategy $x$ assigns a probability to each row of $M_n$. Define an unnormalised probability distribution $\Phi$
over the columns as follows. Let $\Phi(j)$ be the probability that a 1 will be in column $j$ of $M_n$, given a
row sampled from $x$. Note that $\Phi(j) \le 1$, with equality when every row that is played with positive
probability has a 1 in column $j$. Because every row contains $k$ 1's, the values $\Phi(j)$ sum to $k$:
$\sum_{j=1}^{n} \Phi(j) = k$.

We define column $m$ to be one with a lowest value of $\Phi$: $m \in \arg\min_j \Phi(j)$. We choose $m$
to be the special column in the statement of the Lemma, and we suppose that the column player
allocates probability $p$ to $m$.

Since the sum over all values $\Phi(j)$ is $k$ and there are $n$ columns, this means that $\Phi(m) \le \frac{k}{n}$.
When column $m$ is played (and we assume it is played with probability $p$) it gives the row player
a payoff of 0 with a probability of at least $1 - \frac{k}{n}$.

We now consider the row player's strategy $x$ and construct an improved response $x^*$ as follows.
$x^*$ will differ from $x$ in the following way. For every row $i$ we see if its $m$-th entry is a 1. If this
is the case, we do not change anything. If instead its $m$-th entry is a 0, we do the following: look
at the entries where there is a 1 in row $i$. Of all the entries where there is a 1, we select the one
to which the column player's distribution $y$ gives the lowest probability, say entry $a$ (i.e. choose
column $a \in \arg\min_{j : M_n[i,j] = 1} y[j]$). Now we move all the probability allocated to row $i$ by $x$ to
the row of $M_n$ that instead has a 0 in entry $a$ and a 1 in entry $m$, and is otherwise the same as $i$.

The probability on entry $a$ is defined as the smallest among all the entries where row $i$ has a
1. We can bound the probability that is allocated to this entry by distribution $y$. A probability
at least $p$ is given to column $m$, so a probability of at most $1 - p$ can be distributed over the remaining
columns. The column containing entry $a$ has the smallest probability among at least $k$ columns,
so the probability given to column $a$ is at most $\frac{1-p}{k}$.

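The result of this construction of $x^*$ from $x$ is that every row that is played with positive
probability by $x^*$ will have a 1 in the $m$-th entry. There is a probability at least $(1 - \frac{k}{n})$ that a row
sampled from $x$ does not have a 1 in the $m$-th entry. Each such row gains at least $p$ from column $m$ and
loses at most $\frac{1-p}{k}$ from column $a$. This means that the increase in payoff from
replacing $x$ with $x^*$ is at least $(1 - \frac{k}{n})(p - \frac{1-p}{k})$.

Noting that $k = \lfloor \sqrt{n} \rfloor$ gives us the desired result. □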
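The construction in the proof of Lemma 1 can be traced on a small instance. The sketch below is an illustration under assumptions: the helper names are hypothetical, $n = 9$ and $p = 0.5$ are arbitrary choices, and the row strategy $x$ and the rest of $y$ are drawn at random. It computes $\Phi$, picks the special column $m$, builds $x^*$ by the swap described above, and returns the actual payoff gain together with the per-row bound (moved probability mass times $p - \frac{1-p}{k}$).

```python
from itertools import combinations
from math import isqrt
import random


def build_M(n):
    """All 0/1 rows of length n with exactly k = floor(sqrt(n)) ones."""
    k = isqrt(n)
    rows = [[1 if j in ones else 0 for j in range(n)]
            for ones in map(set, combinations(range(n), k))]
    return rows, k


def phi(M, x):
    """Phi(j): probability that a row sampled from x has a 1 in column j."""
    return [sum(x[r] * M[r][j] for r in range(len(M))) for j in range(len(M[0]))]


def improved_response(M, x, y, m):
    """Move the mass of each row with a 0 in column m to its swapped row."""
    index = {tuple(row): r for r, row in enumerate(M)}
    x_star = [0.0] * len(M)
    for r, row in enumerate(M):
        if row[m] == 1:
            x_star[r] += x[r]
            continue
        # Column a: the 1-entry of this row to which y gives the least probability.
        a = min((j for j, bit in enumerate(row) if bit == 1), key=lambda j: y[j])
        swapped = list(row)
        swapped[a], swapped[m] = 0, 1
        x_star[index[tuple(swapped)]] += x[r]
    return x_star


def payoff(M, x, y):
    """Row player's expected payoff x^T M y."""
    return sum(x[r] * sum(M[r][j] * y[j] for j in range(len(y))) for r in range(len(M)))


def demo(n=9, p=0.5, seed=1):
    rng = random.Random(seed)
    M, k = build_M(n)
    w = [rng.random() for _ in M]
    total = sum(w)
    x = [v / total for v in w]
    ph = phi(M, x)
    m = min(range(n), key=lambda j: ph[j])
    rest = [rng.random() if j != m else 0.0 for j in range(n)]
    rest_total = sum(rest)
    y = [p if j == m else (1 - p) * rest[j] / rest_total for j in range(n)]
    gain = payoff(M, improved_response(M, x, y, m), y) - payoff(M, x, y)
    moved_mass = 1.0 - ph[m]
    return k, ph[m], gain, moved_mass * (p - (1 - p) / k)
```

Since the regret of $x$ is at least the gain from switching to $x^*$, the returned gain is a lower bound on the row player's regret in this instance.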

We use the following technical extension in the proof of Theorem 2. It is a straightforward
corollary of Lemma 1.

Corollary 1 Suppose we have a bimatrix game where the row player's payoff matrix $R$ has $M_n$ (as
in Definition 1) as a submatrix. Suppose furthermore that all rows of $R$ that do not intersect $M_n$
pay the row player zero, and for all rows of $R$ that intersect $M_n$, all columns that are not columns
of $M_n$ pay the row player 1.

Let $x$ be any mixed strategy for the row player that allocates probability at least $p_r$ to rows that
do not intersect $M_n$. Let $y$ be a mixed strategy for the column player that allocates at least $p_c$ to
columns that do not intersect $M_n$, but allocates probability $p$ to some column $i$ intersecting $M_n$.
Then, there exists a choice of column $i$ such that the row player's regret is at least $p - O(1/\sqrt{n}) + p_r p_c$.

Proof. Suppose $x$ is modified as follows. For rows that intersect $M_n$, modify their probabilities
according to Lemma 1. For other rows, set their probability to 0, and transfer their probability to
an arbitrary row that has payoff 1 when column $i$ is played.

This change increases by $p - O(1/\sqrt{n})$ the payoffs to the row player resulting from the column
player playing columns intersecting $M_n$. Note that this gain is not conditioned on the column player
playing columns intersecting $M_n$; it is an absolute gain. In more detail, for rows intersecting $M_n$,
a fraction $1 - \frac{k}{n}$ of them (w.r.t. probability measure $x$) have their payoff raised by at least $p - \frac{1-p}{k}$.
For rows not intersecting $M_n$, their payoffs are raised by at least $p$.

There is an additional increase to the row player's payoff due to the transfer of probability from
rows not intersecting $M_n$ to rows intersecting $M_n$, in the event that the column player plays a
column not intersecting $M_n$. In this case the payoffs increase from 0 to 1, resulting in an additional
payoff to the row player of (at least) $p_r p_c$. □

In the communication-free setting, each player $p$ computes a function $f_p$ from his payoff matrix to
a mixed strategy. We will first introduce a "commitment measure" that measures the variability
of mixed strategies that may be selected by $p$, i.e. the image of the set of all payoff matrices under $f_p$.

The variation distance between two probability distributions $x$ and $x'$ over $[n]$ is half the sum
of the absolute differences between the two distributions, i.e.

$$d(x, x') = \frac{1}{2} \sum_{i=1}^{n} |x[i] - x'[i]|$$
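As a quick numerical illustration of the variation distance (a sketch only; the function name is an assumption), note that distributions with disjoint supports are at distance 1, and that the midpoint of two distributions lies at half their distance from each of them, a fact used later in the case analysis.

```python
def variation_distance(x, x_prime):
    """d(x, x') = (1/2) * sum_i |x[i] - x'[i]| for distributions over the same set."""
    return 0.5 * sum(abs(a - b) for a, b in zip(x, x_prime))


s1 = [1.0, 0.0, 0.0]
s2 = [0.0, 0.5, 0.5]
s12 = [(a + b) / 2 for a, b in zip(s1, s2)]

print(variation_distance(s1, s2))   # 1.0 (disjoint supports)
print(variation_distance(s12, s1))  # 0.5
print(variation_distance(s12, s2))  # 0.5
```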

For $n \times n$ games, let $\Omega_n^r$ denote the set of strategies the row player may use (i.e. the image of $f_r$) and
$\Omega_n^c$ the set of strategies the column player may use. For each player we define his "centre strategy".
For the row player the centre strategy $c_n^r$ is the probability distribution such that the maximum distance
between $c_n^r$ and any strategy $u \in \Omega_n^r$ is minimised:

$$c_n^r = \arg\min_{c} \sup_{u \in \Omega_n^r} d(c, u)$$

The centre distribution $c_n^c$ of the column player is defined in a similar way. The commitment of
the row player is defined as

$$\tau_n^r = 1 - \sup_{u \in \Omega_n^r} d(c_n^r, u)$$

The commitment $\tau_n^c$ of the column player is defined similarly. This commitment measure will
be a value in $[0, 1]$ that indicates the variability of strategies a player may use, and is high when
the player always plays strategies that are close to some "central" strategy.

Theorem 2 For bimatrix games with payoffs in the range $[0, 1]$, if each player independently
computes a mixed strategy based on his own payoff matrix, then it is impossible to guarantee an
$\epsilon$-approximate Nash equilibrium for $\epsilon < 0.501$.

Proof. The proof will be a case analysis on commitment. In the proof, our analysis is with respect
to an arbitrary fixed value of $n$, so we drop the subscript $n$ from the commitment values $\tau^r$ and $\tau^c$,
and also from the centre probability vectors $c^r$ and $c^c$. We will show that for all $n$, the regret of a player is
at least 0.501. We identify two cases:

1. A player has a low commitment: $\tau^r < 0.05$ or $\tau^c < 0.05$

2. Neither player has a low commitment: $0.05 \le \tau^r$ and $0.05 \le \tau^c$

Case 1: A player has low commitment
Assume the column player has low commitment, thus $\tau^c < 0.05$. We use this low commitment to
identify a set of strategies that are quite far apart from each other, under variation distance.

For the column player, take an arbitrary strategy $s_1 \in \Omega^c$. Because $\tau^c < 0.05$, there must be
some strategy $s_2$ with $d(s_1, s_2) > 0.95$, otherwise $s_1$ could be the centre strategy $c^c$ with $\tau^c \ge 0.05$.

Now consider the strategy $s_{12} = \frac{s_1 + s_2}{2}$, thus $d(s_{12}, s_1) = d(s_{12}, s_2) = \frac{1}{2} d(s_1, s_2) \le \frac{1}{2}$. For this
strategy not to be a centre strategy $c^c$ contradicting $\tau^c < 0.05$, there must be some strategy $s_3 \in \Omega^c$
with $d(s_{12}, s_3) > 0.95$. Because $s_1$ constitutes half of the strategy $s_{12}$, it holds that $d(s_1, s_3) > 0.90$
and similarly $d(s_2, s_3) > 0.90$. We have

$$d(s_1, s_2) > 0.95; \quad d(s_1, s_3) > 0.90; \quad d(s_2, s_3) > 0.90; \quad d(s_{12}, s_3) > 0.95.$$

The next step is to construct an $n \times n$ payoff matrix $R$ of the row player. Only the first 3 rows of
$R$ will contain non-zero entries. The construction of rows 1, 2, 3 will be such that for $i, j \in \{1, 2, 3\}$,
row $i$ is a best response to $s_i$ and a poor response to $s_j$ ($j \ne i$).

For every column $j$ of $R$ determine the maximum of $s_1[j]$, $s_2[j]$ and $s_3[j]$. If $s_1[j]$ is the largest,
$R_{1j} = 1$ and $R_{2j} = R_{3j} = 0$. If $s_2[j]$ is the largest, $R_{2j} = 1$ and $R_{1j} = R_{3j} = 0$. If $s_3[j]$ is the
largest, $R_{3j} = 1$ and $R_{1j} = R_{2j} = 0$. In case of a tie in the comparison of $s_1[j]$, $s_2[j]$ and $s_3[j]$, all
the entries corresponding to the tie get a 1.
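The construction of the first three rows can be sketched as follows. The helper names and the three example strategies are assumptions made for illustration; the examples are chosen so that the pairwise distances exceed the thresholds used in the proof.

```python
def build_first_three_rows(s1, s2, s3):
    """Rows 1-3 of R: a 1 wherever the matching strategy attains the column maximum."""
    strategies = (s1, s2, s3)
    n = len(s1)
    rows = [[0] * n for _ in range(3)]
    for j in range(n):
        top = max(s[j] for s in strategies)
        for i, s in enumerate(strategies):
            if s[j] == top:          # ties: every maximiser gets a 1
                rows[i][j] = 1
    return rows


def payoff(row, y):
    """Expected payoff of a pure row against the column player's mixed strategy y."""
    return sum(r * p for r, p in zip(row, y))


# Hypothetical strategies that are pairwise at variation distance 0.97.
s1 = [0.98, 0.01, 0.01]
s2 = [0.01, 0.98, 0.01]
s3 = [0.01, 0.01, 0.98]
rows = build_first_three_rows(s1, s2, s3)
```

In this example row 1 earns 0.98 against `s1` while rows 2 and 3 earn 0.01 each, in line with the bounds derived in the text.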

Consider columns $i$ for which $R_{2i} = 1$, so that $s_2[i] \ge s_1[i]$. The total probability assigned
by $s_1$ to these columns is bounded by 0.05. If the probability on these columns was higher than
0.05, it would follow that $d(s_1, s_2) < 0.95$. Similarly we can bound the probability assigned by $s_1$
to columns $i$ such that $R_{3i} = 1$. Since $d(s_1, s_3) > 0.9$ this probability is at most 0.1. From these
observations, we have that at most 0.15 of the probability distribution $s_1$ is allocated to columns
that could give a payoff of 0 for row 1. Since each column of $R$ contains at least one 1, the remaining
0.85 probability of $s_1$ will be allocated to columns that have a 1 in the corresponding entry of row
1. The payoff for row 1 if the column player plays $s_1$ is therefore at least 0.85. We can use a similar
argument to claim that when the column player plays $s_2$, the row player can get a payoff of at least
0.85 by playing pure strategy row 2, and at most 0.05 for row 1, and at most 0.1 for row 3.

For row 3 we use $d(s_{12}, s_3) > 0.95$. Consider columns $i$ for which $R_{3i} = 0$, so that either $R_{1i} = 1$
or $R_{2i} = 1$. A column $i$ having this property contributes $\ge \frac{1}{2} s_3[i]$ to the overlap between $s_3$ and
$s_{12}$. Indeed, if both $R_{1i} = 1$ and $R_{2i} = 1$, it contributes $s_3[i]$ to the overlap. So we can deduce
that with respect to columns selected using $s_3$, Pr(row 1 pays 1) + Pr(row 2 pays 1) < 0.1. Again,
since each column of $R$ contains at least one 1, the remaining 0.9 probability of $s_3$ will be allocated
to columns that have a 1 in the corresponding entry of row 3. The payoff for row 3 if the column
player plays $s_3$ is therefore at least 0.9, while the payoffs to rows 1 and 2 sum to at most 0.1. To
summarise:

• If the column player plays $s_1$, the row player gets a payoff of at least 0.85 by playing row 1.
Playing row 2 would give him a payoff of at most 0.05 and playing row 3 a payoff of at most
0.1.

• If the column player plays $s_2$, the row player gets a payoff of at least 0.85 by playing row 2.
Playing row 1 would give him a payoff of at most 0.05 and playing row 3 a payoff of at most
0.1.

• If the column player plays $s_3$, the row player gets a payoff of at least 0.9 by playing row 3.
Playing row 1 would give him a payoff of at most 0.1 and playing row 2 a payoff of at most
0.1. Moreover, the sum of payoffs of row 1 and row 2 is at most 0.1.

Given the row player's strategy, let $(r_1, r_2, r_3)$ be the probabilities with which he plays rows 1, 2, 3.
Assume $r_1 \le r_2, r_3$, so $r_1 \le \frac{1}{3}$ and suppose the column player plays strategy $s_1$. The best response
strategy (1, 0, 0) has a payoff of $a \in [0.85, 1]$. Because row 1 clearly gives the highest payoff, the
regret is minimised when this row is played with as much probability as possible, so $r_1 = \frac{1}{3}$. Because
the probability on row 1 was defined as the lowest probability, the probability on the other two
rows is also $\frac{1}{3}$. This gives a regret of at least

$$a - \left( \frac{1}{3} a + \frac{1}{3}(0.05) + \frac{1}{3}(0.1) \right) = \frac{2}{3} a - 0.05 \ge \frac{2}{3}(0.85) - 0.05 \approx 0.517$$

The analysis for $r_2 \le r_1, r_3$ where the column player plays $s_2$ is similar.

Assume $r_3 \le r_1, r_2$ and the column player plays $s_3$. The best response to $s_3$ has a payoff $a$ of at
least 0.9 and rows 1 and 2 combined can have a payoff of at most 0.1. This gives a regret of at least

$$a - \left( \frac{1}{3} a + \frac{1}{3}(0.1) \right) = \frac{2}{3} a - \frac{1}{30} \ge \frac{2}{3}(0.9) - \frac{1}{30} \approx 0.567$$

So regardless of the strategy of the row player, the regret of the row player is always larger than 0.501
when the commitment of the other player is below 0.05.
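The two regret bounds of Case 1 can be checked with exact rational arithmetic. The sketch below assumes the worst case described in the text, where each of the three rows is played with probability 1/3; the helper name `regret_low_row` is illustrative.

```python
from fractions import Fraction as F


def regret_low_row(a, others):
    """Regret when the best-response row (payoff a) has weight 1/3 and the
    remaining rows, each with weight 1/3, have the payoffs listed in `others`."""
    third = F(1, 3)
    return a - (third * a + third * sum(others))


# Column player plays s1 (or s2): a >= 0.85, other rows pay at most 0.05 and 0.1.
bound_s1 = regret_low_row(F(85, 100), [F(5, 100), F(1, 10)])
# Column player plays s3: a >= 0.9, rows 1 and 2 together pay at most 0.1.
bound_s3 = regret_low_row(F(9, 10), [F(1, 10)])
```

Here `bound_s1` equals 31/60 (about 0.517) and `bound_s3` equals 17/30 (about 0.567), both above 0.501.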

Case 2: Neither player has low commitment

Suppose both players have commitment $\tau^r, \tau^c \ge 0.05$. Consider the following set of payoff matrices
for the column player: $C^1, \ldots, C^n$, where $C^i$ has a payoff of 1 for every entry in the $i$-th column
and a 0 elsewhere:

$$\forall k, j : C^i_{kj} = 1 \text{ if } j = i; \ 0 \text{ otherwise}$$

To achieve a 0.501-approximate Nash equilibrium, when the column player has payoff matrix $C^i$,
the column player should assign at least 0.499 to column $i$.
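To see why at least 0.499 of the probability must go to column $i$, the following sketch builds such a matrix and evaluates the column player's regret. The function names `make_C` and `column_regret`, and the example strategies, are assumptions for illustration (columns are indexed from 0 in the code).

```python
def make_C(i, n):
    """n x n column-player matrix with 1 in every entry of column i and 0 elsewhere."""
    return [[1 if j == i else 0 for j in range(n)] for _ in range(n)]


def column_regret(C, x, y):
    """Best pure-column payoff against x minus the payoff of mixed strategy y."""
    n_cols = len(C[0])
    col_payoffs = [sum(x[k] * C[k][j] for k in range(len(x))) for j in range(n_cols)]
    return max(col_payoffs) - sum(y[j] * col_payoffs[j] for j in range(n_cols))


C = make_C(1, 3)
x = [0.2, 0.3, 0.5]                 # any row strategy gives the same column payoffs
y = [0.3, 0.499, 0.201]             # exactly 0.499 on the all-ones column
regret = column_regret(C, x, y)     # equals 1 - y[1]
```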

The construction of the payoff matrix $R$ of the row player will depend on the centre strategy $c^r$
of the row player. Take the $(n - \sqrt{n})$ rows of $R$ which have the highest values $c^r[i]$, where $c^r[i]$ is
the $i$-th entry of $c^r$. We construct matrix $R$ for which these rows are all zero. For the construction
of the remaining $\sqrt{n}$ rows of $R$ we consider $c^c$, the centre distribution of the column player. We
select the $(n - \sqrt{n})$ columns $j$ of $R$ having the highest values $c^c[j]$. If row $i$ is one of the rows with
one of the $\sqrt{n}$ smallest entries for $c^r$ and column $j$ is a column with one of the $(n - \sqrt{n})$ highest
entries for $c^c$, then we set $R_{ij} = 1$. The payoff entries in $R$ that are still undefined can be seen as
a $(\sqrt{n} \times \sqrt{n})$-sub-matrix.

This submatrix will contain a submatrix $M_{n'}$ as in Definition 1 where (for $k' = \lfloor \sqrt{n'} \rfloor$) $\binom{n'}{k'} =
\sqrt{n}$. The extra columns of the submatrix have all their payoffs set to 1. The entire matrix $R$
now satisfies the conditions of Corollary 1. Let $S$ be the set of non-zero rows; by construction
$\sum_{i \in S} c^r[i] \le \frac{1}{\sqrt{n}}$. Since $d(x, c^r) \le 0.95$, we have $\sum_{i \in S} x[i] \le 0.95 + \frac{1}{\sqrt{n}}$, so $\sum_{i \notin S} x[i] \ge 0.05 - \frac{1}{\sqrt{n}}$.

Similarly $y$ has measure $\ge 0.05 - \frac{1}{\sqrt{n}}$ on columns not intersecting $M_{n'}$.

The values of $p_r$ and $p_c$ in Corollary 1 are $0.05 - \frac{1}{\sqrt{n}}$ and the value of $p$ is 0.499, so we get a
regret of at least $0.499 - o(1) + (0.05 - o(1))^2 = 0.5015 - o(1)$. □

3 One-way Communication 

We noted in Section 1.2.1 that $\epsilon = \frac{1}{2}$ can be achieved with one-way communication, by a simple
implementation of the DMP-algorithm, using logarithmic communication. The following result
gives a matching lower bound of $\frac{1}{2}$. It thus also furnishes a slightly simpler lower-bound result
for the communication-free setting of the previous section, but of course the lower bound itself is
necessarily weaker.

Theorem 3 It is impossible to guarantee to find an $\epsilon$-Nash equilibrium, for any constant $\epsilon < \frac{1}{2}$,
with unlimited one-way communication.

Proof. We consider games $G = (R, C)$, where $R$ and $C$ are payoff matrices with dimensions $\binom{n}{k} \times n$,
with $k \approx \sqrt{n}$. Consider the following set of column player payoff matrices $C^1, \ldots, C^n$, where $C^\ell$
has a payoff of 1 for every entry in the $\ell$-th column and a 0 elsewhere:

$$\forall i, j : C^\ell_{ij} = 1 \text{ if } j = \ell; \ 0 \text{ otherwise}$$

The row player has matrix $R = M_n$ with $M_n$ as in Definition 1.

Let $x$ be the strategy of the row player, resulting from matrix $R$. Let $y_\ell$ be the strategy of the
column player resulting from matrices $R$ and $C^\ell$; note that with unlimited one-way communication
we can assume that the row player communicates all of $R$ (and indeed, $x$) to the column player.

We will show that for this class of games, one cannot do better than a $(\frac{1}{2} - o(1))$-approximate
Nash equilibrium.

We search for a lower bound of $\frac{1}{2} - z$, and we identify a value of $z = o(1)$ that applies.

First observe that a best response for the column player having matrix $C^\ell$ is $e_\ell$, the pure
strategy of column $\ell$. Column $\ell$ has payoff 1 and other columns have payoff 0. So to reach a
$(\frac{1}{2} - z)$-approximate Nash equilibrium, $y_\ell$ must allocate a probability of at least $(\frac{1}{2} + z)$ to column $\ell$.

So, an arbitrary column $\ell$ can be required to have probability at least $\frac{1}{2} - z$. Lemma 1 says
that the row player's regret is then at least $\frac{1}{2} - z - o(1)$. Choosing $z = o(1)$, we find that for all $x$, $\ell$
may be chosen such that in order for the column player to have regret less than $\frac{1}{2} - o(1)$, the
row player must have regret at least $\frac{1}{2} - o(1)$. □

Theorem 4 It is impossible to guarantee to find an $\epsilon$-well-supported Nash equilibrium, for any
constant $\epsilon < 1$, with unlimited one-way communication.

Proof. To prove this theorem we will only have to look at $2 \times 2$ games. The row player has the
identity matrix and the column player has one of two different column matrices. Communication
is only allowed from the row player to the column player.

In any $\epsilon$-well-supported Nash equilibrium for $\epsilon < 1$, the column player must play pure strategy
column $j$, given payoff matrix $C^j$. That is necessary regardless of the information she receives from
the row player.

No communication is allowed from the column player to the row player, so the row player's
strategy is determined by matrix $R$. Let $f^r(R)$ be the row player's strategy. If $f^r(R)$ allocates
positive probability to row $i$, then we fail to have an $\epsilon$-well-supported Nash equilibrium (for any
$\epsilon < 1$) when the column player has matrix $C^{3-i}$, since when that happens, row $i$ pays the row
player 0 while the other row pays 1. □
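The argument can be checked mechanically on the $2 \times 2$ instance. The sketch below assumes, as in the proof, that under matrix $C^j$ the column player plays pure column $j$; the function names are illustrative, and rows and columns are indexed from 0.

```python
def ws_epsilon_row(R, x, j):
    """Smallest eps for which row strategy x is eps-well-supported against pure column j."""
    col = [row[j] for row in R]
    best = max(col)
    return max(best - col[i] for i, p in enumerate(x) if p > 0)


R = [[1, 0], [0, 1]]    # the row player's identity matrix


def worst_case(x):
    """Worst eps over the two column matrices for a fixed row strategy x."""
    return max(ws_epsilon_row(R, x, j) for j in range(2))
```

Whatever strategy the row player fixes in advance, some row in its support pays 0 against one of the two columns, so `worst_case` returns 1.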

4 Communication-bounded algorithms 

This section presents the main positive results: algorithms that compute approximate Nash equilibria
using only polylogarithmic communication. Section 4.1 gives the main result for $\epsilon$-Nash
equilibria, and Section 4.3 gives a variation of the algorithm that computes an $\epsilon$-well-supported Nash
equilibrium for $\epsilon \approx 0.732$.

4.1 A 0.438-approximate Nash equilibrium procedure with limited communication

This section provides a 0.438-approximate Nash equilibrium procedure where the amount of communication
between the players is polylogarithmic in $n$. We present the algorithm as an
$\alpha$-approximate Nash equilibrium procedure first and then optimize $\alpha$. At various points the algorithm
uses the operation of communicating a mixed strategy (a probability distribution over $[n]$) from
one player to the other; the details of this operation are given in Section 4.2. The general idea is
to communicate a sample of size $O(\log n)$ from the distribution and argue that the corresponding
empirical distribution is a good enough estimate for our purposes.

First the row player finds a Nash equilibrium for the zero-sum game $(R, -R)$ and the column
player computes a Nash equilibrium for the zero-sum game $(-C, C)$. Since both games are zero-sum,
we know that the payoff values for their Nash equilibria will be unique. Both players compare
this payoff value with $\alpha$. We distinguish two cases:

1. neither player can ensure himself a payoff more than $\alpha$, or

2. at least one of the players can ensure a payoff more than $\alpha$.

With $O(1)$ communication, the case that holds can be identified.

Case 1: the value of both zero-sum games is $\le \alpha$ to each player

The row player finds a strategy pair $(x^*_r, y^*_r)$ as solution to $(R, -R)$, while the column player finds a
strategy pair $(x^*_c, y^*_c)$ as solution to $(-C, C)$. The row player communicates $y^*_r$ to the column player
(as described in Section 4.2) and the column player sends $x^*_c$ to the row player. They now play the
game $(R, C)$ using strategy pair $(x^*_c, y^*_r)$. Since $y^*_r$ is a Nash equilibrium strategy in the zero-sum
game $(R, -R)$ and the row player still plays with payoff matrix $R$, by the minimax theorem, the
row player has no strategy that can give him a payoff of more than $\alpha$. The row player has a best
response with a value of at most $\alpha$, so his regret is also at most $\alpha$. The strategy $x^*_c$ was a Nash
equilibrium strategy in the zero-sum game $(-C, C)$ and the column player still has payoff matrix
$C$. So we can use the same argument for the column player to claim that when the row player
plays strategy $x^*_c$, the column player has regret at most $\alpha$. So, we have an $\alpha$-approximate Nash
equilibrium. This concludes Case 1.
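The Case 1 argument rests on a simple fact: with payoffs in [0, 1], a player's regret is never larger than the value of his best response. The sketch below illustrates this on a small hypothetical matrix in which the column strategy `y` holds every row to a payoff of at most 0.25. The matrix, the strategies, and the function names are assumptions for illustration.

```python
def expected_payoff(M, x, y):
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


def best_response_value(R, y):
    """Highest payoff any pure row achieves against the column strategy y."""
    return max(sum(R[i][j] * y[j] for j in range(len(y))) for i in range(len(R)))


def row_regret(R, x, y):
    return best_response_value(R, y) - expected_payoff(R, x, y)


# Hypothetical row-player matrix; y holds every row to at most 0.25.
R = [[0.4, 0.2], [0.1, 0.3], [0.0, 0.0]]
y = [0.25, 0.75]
alpha = 0.3
# Even if the strategy received from the other player puts all weight on the
# worst row, the regret cannot exceed the best-response value, hence alpha.
x = [0.0, 0.0, 1.0]
regret = row_regret(R, x, y)
```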

Case 2: one or both players can guarantee a payoff $> \alpha$

If at least one of the players has a value of more than $\alpha$ for his zero-sum game, he can get a payoff
of more than $\alpha$ if he plays this strategy, regardless of the strategy of the other player. Assume w.l.o.g.
that it is the row player who has a payoff greater than $\alpha$ in his zero-sum game. He communicates
this strategy $x^*$ to the column player (again, as described in Section 4.2). The column player
identifies a pure strategy best response $e_j$ to $x^*$ and communicates $e_j$ to the row player (using $\log n$
bits).

At this point in the algorithm we have the strategy pair $(x^*, e_j)$. The column player has a best
response strategy, so at this point his regret is 0. The row player's strategy $x^*$ is paying him more
than $\alpha$. Let $\beta \le 1$ be the value of his best response to $e_j$. So at this point the row player has a
regret of at most $\beta - \alpha$. We next deal with the possibility that $\beta - \alpha > \alpha$.

At this stage the column player has regret 0 while we are only looking for regret to be bounded
by $\alpha$; meanwhile the row player has a strategy that might not be good enough for an $\alpha$-approximate
Nash equilibrium. To change this, we use a method used in [3] (Lemma 3.2), which allows the row
player to shift some of his probability to his best response to $e_j$. By shifting some of his probability,
it could be that $e_j$ no longer is a best response strategy for the column player. This is acceptable,
as long as the column player's regret while playing $e_j$ is at most $\alpha$. Suppose the row player shifts
$\frac{1}{2}\alpha$ of his probability to a best response strategy. The payoff the column player gets with $e_j$
could be $\frac{1}{2}\alpha$ lower because of this move. The payoff of some other column(s) could go as much
as $\frac{1}{2}\alpha$ higher because of this shift. The strategy $e_j$ had regret 0, so by the shift of $\frac{1}{2}\alpha$ of the row
player's probability, the regret of the column player is at most $\frac{1}{2}\alpha + \frac{1}{2}\alpha = \alpha$, which satisfies the
condition for an $\alpha$-approximate Nash equilibrium, for the column player.

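The row player is allowed to change the allocation of $\frac{1}{2}\alpha$ of his probability that was allocated
to strategies having the lowest payoff. The remainder of his probability, $1 - \frac{1}{2}\alpha$, had already at
least an average payoff of $\alpha$. The probability is shifted to his best response with a value of $\beta$, with
$\alpha < \beta \le 1$. The following inequality is a sufficient condition for the row player's regret to be at
most $\alpha$:

$$\left(1 - \frac{1}{2}\alpha\right)\alpha + \frac{1}{2}\alpha\beta \ge \beta - \alpha, \qquad 0 \le \alpha \le \beta \le 1$$

The solutions to this inequality are

$$\begin{array}{ll}
0 < \alpha < \frac{1}{2}(5 - \sqrt{17}), & \alpha < \beta \le \frac{4\alpha - \alpha^2}{2 - \alpha} \\
\frac{1}{2}(5 - \sqrt{17}) \le \alpha < 1, & \alpha < \beta \le 1 \\
\alpha = 0, \ \beta = 0; & \alpha = 1, \ \beta = 1
\end{array}$$

where it holds that if $\alpha = \frac{1}{2}(5 - \sqrt{17})$ then $f(\alpha) = \frac{4\alpha - \alpha^2}{2 - \alpha} = 1$ and for $0 < \alpha < 1$ this function is
monotone increasing. This procedure will give an $\alpha$-approximate Nash equilibrium, so $\alpha$ should be
as low as possible. Next to this it should also hold for every $\beta$ with $\alpha < \beta \le 1$. The lowest $\alpha$ such
that this condition holds is when $f(\alpha) = 1$, thus $\alpha = \frac{1}{2}(5 - \sqrt{17}) \approx 0.438$.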
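The threshold value of $\alpha$ can be verified numerically. Rearranging the sufficient condition gives an upper bound on $\beta$ as a function of $\alpha$; setting this bound equal to 1 yields the quadratic $\alpha^2 - 5\alpha + 2 = 0$. The sketch below is a check of that algebra, with `f` standing for the bound on $\beta$.

```python
import math


def f(alpha):
    """Largest beta satisfying (1 - alpha/2)*alpha + (alpha/2)*beta >= beta - alpha."""
    return (4 * alpha - alpha ** 2) / (2 - alpha)


def condition_holds(alpha, beta, tol=1e-12):
    """The sufficient condition for the row player's regret to be at most alpha."""
    return (1 - alpha / 2) * alpha + (alpha / 2) * beta >= beta - alpha - tol


alpha_star = (5 - math.sqrt(17)) / 2     # smaller root of alpha^2 - 5*alpha + 2 = 0
```

At `alpha_star` the bound `f` equals 1, so the condition holds for every $\beta \le 1$; for slightly smaller $\alpha$ it fails at $\beta = 1$.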

So if the row player rearranges $\frac{1}{2} \cdot 0.438 = 0.219$ of his probability to his best response row,
both players have a strategy that guarantees them a 0.438-approximate Nash equilibrium.
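The probability shift can be sketched as below. This is an illustrative implementation under stated assumptions, not the procedure of [3]: the function name, the example strategy, and the example payoffs against $e_j$ are all hypothetical, and probability is taken from the lowest-paying rows first.

```python
def shift_to_best_response(x, row_payoffs, amount):
    """Move `amount` of probability from the lowest-paying rows to a best-response row.

    x           -- the row player's mixed strategy
    row_payoffs -- payoff of each pure row against the column player's pure strategy
    """
    new_x = list(x)
    best = max(range(len(x)), key=lambda i: row_payoffs[i])
    remaining = amount
    for i in sorted(range(len(x)), key=lambda i: row_payoffs[i]):
        if i == best or remaining <= 0:
            continue
        moved = min(new_x[i], remaining)
        new_x[i] -= moved
        new_x[best] += moved
        remaining -= moved
    return new_x


def expected(x, row_payoffs):
    return sum(p * v for p, v in zip(x, row_payoffs))


alpha = 0.438
x_star = [0.5, 0.5, 0.0]      # hypothetical strategy with average payoff 0.45 > alpha
payoffs = [0.9, 0.0, 1.0]     # hypothetical row payoffs against e_j, so beta = 1.0
x_new = shift_to_best_response(x_star, payoffs, alpha / 2)
regret_before = max(payoffs) - expected(x_star, payoffs)
regret_after = max(payoffs) - expected(x_new, payoffs)
```

In this example the regret drops from 0.55 to 0.331, which is below $\alpha$, while only 0.219 of the probability has moved.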



4.2 Communicating mixed strategies

We describe how to communicate an approximation of the mixed strategies that are computed,
using $O(\log^2 n)$ bits. We ultimately obtain an $\epsilon$ of $0.438 + \delta$, for any $\delta > 0$.

We first look at the case where one of the players, assume w.l.o.g. the row player, has a payoff
higher than $\alpha$ in the Nash equilibrium of his zero-sum game $(R, -R)$. The column player plays a
pure best response to the strategy of the row player, regardless of the support of the strategy of
the row player. So we mainly consider the row player.

The zero-sum game $(R, -R)$ gives a strategy pair $(x^*, y^*)$. Fix a sample size $k = O(\log n)$ and form a multiset
$A$ by sampling $k$ times from the set of pure strategies of the row player, independently at random
according to the distribution $x^*$. Let $x'$ be the mixed strategy for the row player with a probability
of $\frac{1}{k}$ for every member of $A$. We want the distribution $x'$ to have a payoff close to the payoff of $x^*$.
This corresponds to the following event:

$$\phi = \{ (x')^T R y^* - (x^*)^T R y^* \ge -\delta \}$$

As noted in [27] the expression $(x')^T R y^*$ is essentially a sum of $k$ independent random variables
each of expected value $(x^*)^T R y^*$, where every random variable has a value between 0 and 1. This
means we can bound the probability that $\phi$ does not hold, which we will call $\bar{\phi}$. Applying a
standard tail inequality [22] shows that $\Pr[\bar{\phi}]$ decreases exponentially in $k$, so for a suitable
$k = O(\log n)$ the event $\phi$ holds with probability close to 1. If $x'$ does not give payoffs close enough
to $x^*$, we sample again.

The strategy $x'$ has a guaranteed payoff of $0.438 + \delta - \delta = 0.438$. This strategy is communicated
to the column player. The support of this strategy is logarithmic and all probabilities are rational
(multiples of $\frac{1}{k}$). Communication of one pure strategy has a communication complexity of $O(\log n)$.
This will give a communication complexity for $x'$ of $O(\log^2 n)$.
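The sampling step and its communication cost can be sketched as follows. This is a minimal illustration under assumptions: the function names are hypothetical, the message is taken to be the list of $k$ sampled indices, and Hoeffding's inequality is used as one example of a standard tail bound (the text does not say which inequality [22] provides).

```python
import math
import random


def sample_strategy(x, k, rng):
    """Draw k pure strategies from x; return the draws and their empirical distribution."""
    n = len(x)
    draws = rng.choices(range(n), weights=x, k=k)
    counts = [0] * n
    for i in draws:
        counts[i] += 1
    return draws, [c / k for c in counts]


def message_bits(n, k):
    """Bits needed to send k pure-strategy indices out of n."""
    return k * math.ceil(math.log2(n))


def hoeffding_failure_bound(k, delta):
    """Hoeffding bound on Pr[sample mean < true mean - delta] for k draws in [0, 1]."""
    return math.exp(-2 * k * delta ** 2)


rng = random.Random(0)                      # fixed seed so the sketch is repeatable
draws, x_prime = sample_strategy([0.5, 0.3, 0.2], 40, rng)
```

With $n = 1024$ strategies and $k = 10$ samples the message takes 100 bits, matching the $O(\log^2 n)$ count when $k$ is logarithmic in $n$.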

The column player computes a pure strategy best response to $x'$ and communicates this strategy
in $O(\log n)$ bits to the row player. The strategy of the row player might not yet lead to a 0.438-approximate
Nash equilibrium, since his payoff could be too low. As we have seen before, if the row
player redistributes at most 0.219 of his probability, he is guaranteed to have a strategy that leads
to a 0.438-approximate Nash equilibrium.

This change in strategy of the row player can decrease the payoff of the column player by as
much as 0.219 and increase the payoff of another pure strategy by as much as 0.219. The column
player's strategy was a best response, so his regret was 0, and his gain from switching to another
pure strategy is at most $0.219 + 0.219 = 0.438$; this leads to a 0.438-approximate Nash equilibrium.

In the alternative case, where both players have a low ($\le \alpha$) payoff in their zero-sum games,
the technique is essentially the same: each player samples $k$ times from the opposing distribution,
checks that it limits his own payoff to at most $\alpha + \delta$, re-samples as necessary, and communicates
the $k$-sample.

4.3 A 0.732-well-supported Nash equilibrium procedure with limited communication

We give a variant of the algorithm of the previous section, that produces an $\epsilon$-well-supported Nash
equilibrium for $\epsilon = \sqrt{3} - 1$. Like the previous algorithm, we will first search for an $\alpha$-approximate
Nash equilibrium and later find the optimal value for $\alpha$.

The algorithm starts in the same way as in Section 4.1 with both players computing the Nash
equilibrium of zero-sum games. The row player solves the zero-sum game $(R, -R)$ and the column
player solves $(-C, C)$. The two cases that arise are also the same; Case 1 proceeds as in Section 4.1
while Case 2 requires a variation to the algorithm.

Case 1: the value of both zero-sum games is $\le \alpha$ to each player

First consider the case where both players have a Nash equilibrium with value at most $\alpha$.
The row player has a strategy pair $(x^*_r, y^*_r)$ and the column player a strategy pair $(x^*_c, y^*_c)$. The
row player communicates $y^*_r$ to the column player and the column player sends $x^*_c$ to the row
player. They will now play the game with the strategy pair $(x^*_c, y^*_r)$. If they play according to these
strategies, then no pure strategy yields a payoff of more than $\alpha$, so the strategy profile is
an $\alpha$-well-supported Nash equilibrium.

Case 2: one or both players can guarantee a payoff $> \alpha$

Suppose that a player, assume w.l.o.g. the row player, has a payoff more than $\alpha$ in the Nash
equilibrium of his zero-sum game $(R, -R)$. Let the row player communicate this strategy $x^*$ to
the column player. The column player computes a pure strategy best response $e_j$ to $x^*$ and
communicates this strategy to the row player. Because the row player had a payoff of at least $\alpha$ in
the game $(R, -R)$, he also has a payoff of at least $\alpha$ against $e_j$.

At this point in the algorithm we have a strategy pair $(x^*, e_j)$. The strategy of the column player
is a best response to $x^*$, so his strategy has regret 0. We have no guarantee on the performance of
the row player's strategy, in the context of a well-supported Nash equilibrium.

As in the previous algorithm we allow the row player to shift some of his probability to his
best response to $e_j$. Note that if we shift $\frac{1}{2}\alpha$ of the probability of the row player, this ensures the
column player's payoffs vary by at most $\alpha$.

Let the best response of the row player to $e_j$ have value $\beta \ge \alpha$. The row player's payoff is a
random variable $x$ that takes values in $[0, 1]$ with expectation $E(x) \ge \alpha$, since $x^*$ is the security
strategy for payoff matrix $R$. The maximum value $x$ can take is $\beta$. The algorithm takes all strategies
for which the row player's payoff is less than $\beta - \alpha$, and moves any probability allocated to them
by $x^*$ to any strategy whose payoff is at least $\beta - \alpha$, thus satisfying the conditions for the row
player to also have an $\alpha$-well-supported Nash equilibrium.

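We upper bound the probability $\Pr(x < \beta - \alpha)$ as follows. Subject to $E(x) \ge \alpha$ and $\max(x) = \beta$,
this is maximised when $x$ takes values $\beta$ or $\beta - \alpha$. Let $p = \Pr(x < \beta - \alpha)$. Then

$$E(x) \le p(\beta - \alpha) + (1 - p)\beta = -\alpha p + \beta.$$

We have $E(x) \ge \alpha$. Plugging that into the above,

$$\alpha \le -\alpha p + \beta, \quad \text{i.e.} \quad p \le \frac{\beta - \alpha}{\alpha}.$$

To ensure that the amount of probability shifted is at most $\frac{1}{2}\alpha$, it suffices to let $\frac{\beta - \alpha}{\alpha} \le \frac{1}{2}\alpha$, i.e.
$\alpha^2 + 2\alpha - 2\beta \ge 0$. This is satisfied by $\alpha = -1 + \sqrt{1 + 2\beta}$, so that the worst case value of $\beta$ is 1,
resulting in the claimed value of $\sqrt{3} - 1 \approx 0.732$.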
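The well-supported variant of the shift, and the resulting value of $\alpha$, can be sketched as follows. The function names, the example strategy, and the example payoffs are hypothetical; the example is chosen so that the expected payoff (0.76) is at least $\alpha$, as the analysis requires.

```python
import math


def alpha_for(beta):
    """Positive root of alpha^2 + 2*alpha - 2*beta = 0."""
    return -1 + math.sqrt(1 + 2 * beta)


def max_low_mass(alpha, beta):
    """Upper bound (beta - alpha) / alpha on the probability of payoffs below beta - alpha."""
    return (beta - alpha) / alpha


def make_well_supported(x, row_payoffs, alpha):
    """Move all probability on rows paying less than beta - alpha to a best-response row."""
    beta = max(row_payoffs)
    best = row_payoffs.index(beta)
    new_x, moved = list(x), 0.0
    for i, v in enumerate(row_payoffs):
        if v < beta - alpha:
            moved += new_x[i]
            new_x[best] += new_x[i]
            new_x[i] = 0.0
    return new_x, moved


alpha = alpha_for(1.0)                       # sqrt(3) - 1, about 0.732
payoffs = [0.95, 0.0, 1.0]                   # hypothetical row payoffs against e_j
x_new, moved = make_well_supported([0.8, 0.2, 0.0], payoffs, alpha)
```

Here 0.2 of the probability is moved, which is within the allowed $\frac{1}{2}\alpha \approx 0.366$, and every row left in the support pays at least $\beta - \alpha$.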

5 Conclusions 

Our results raise some open problems, such as how good an approximation should be achievable
in the communication-free setting, and how well we can do in the setting of limited (two-way)
communication. Our communication-bounded algorithms are also based on algorithms that compute
approximate equilibria in polynomial time, and it would be very interesting if further upper
bounds on the communication complexity could be obtained for algorithms whose computational
time was not known to be polynomial. Pastink [30] considers some related topics, including the
communication required for approximate equilibria of games of fixed size. It may be that future
work should address the issue of communication protocols where the players have an incentive to
report their information truthfully.

We believe that the communication-limited algorithm for 0.438-approximate Nash equilibria
is significant, as is the 0.501 lower bound in the communication-free setting, since in the context
of searching for $\epsilon$-approximate Nash equilibria, $\epsilon = 0.5$ frequently seems to arise as a limit on
what is achievable. For example, if we search for approximate equilibria of constant support, the
DMP-algorithm [9] achieves this for $\epsilon = 0.5$; however, Feder et al. [13] show that for $\epsilon < 0.5$, the
support size may need to be logarithmic in $n$. (The corresponding logarithmic upper bound on the
support size that may be needed is due to [27].) In a similar way, while Fictitious Play is known
to guarantee to find $\epsilon$-approximate equilibria for $\epsilon$ approaching 0.5 [5], it has also been established
that $\epsilon = 0.5$ is, in the worst case, a lower bound on the approximation quality attainable [17]. And,
as we find in Theorem 3, 0.5 is also the best approximation that can be guaranteed when there
is a restriction to one-way communication. Finally, Fearnley et al. [11] show that to find $\epsilon$-Nash
equilibria with $\epsilon \ge \frac{1}{2}$, a strictly smaller fraction of the payoffs of the game need to be checked, than
is needed for certain smaller positive values of $\epsilon$.

Acknowledgements: The first author thanks Sergiu Hart for useful discussions during the iAGT
workshop in May 2011.